Statistical significance of MUC-6 results

Author

  • Nancy Chinchor
Abstract

The results of the MUC-6 evaluation must be analyzed to determine whether close scores significantly distinguish systems or whether the differences in those scores are a matter of chance. To do such an analysis, a method of computer-intensive hypothesis testing was developed by SAIC for the MUC-3 results and has been used for distinguishing MUC scores since that time. The implementation of this method for the MUC evaluations was first described in [1], and the concepts behind the statistical model were later explained in a more understandable manner in [2]. This paper gives the results of the statistical testing for the three MUC-6 tasks where a single metric could be associated with a system's performance.
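The related MUC-5 abstract listed on this page names the computer-intensive method as approximate randomization. As a rough illustration only, a paired version of that test might look like the sketch below. It is not the SAIC implementation: the function name, the use of per-message scores, and the choice of the mean score difference as the test statistic are all assumptions made for the example.

```python
import random


def approximate_randomization(scores_a, scores_b, trials=9999, seed=0):
    """Two-sided paired approximate randomization test (illustrative sketch).

    scores_a / scores_b: hypothetical per-message scores for two systems,
    aligned so that index i refers to the same message in both lists.
    The test statistic (difference in mean score) is an assumption; the
    actual MUC scoring may aggregate results differently.
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("both systems must be scored on the same messages")
    if not scores_a:
        raise ValueError("at least one scored message is required")

    rng = random.Random(seed)
    n = len(scores_a)
    observed = abs(sum(scores_a) - sum(scores_b)) / n

    at_least_as_extreme = 0
    for _ in range(trials):
        diff = 0.0
        for a, b in zip(scores_a, scores_b):
            # Under the null hypothesis the system labels are exchangeable,
            # so swap each message's pair of scores with probability 0.5.
            if rng.random() < 0.5:
                a, b = b, a
            diff += a - b
        # Small tolerance guards against floating-point rounding in the sums.
        if abs(diff) / n >= observed - 1e-12:
            at_least_as_extreme += 1

    # Add-one correction keeps the estimated p-value strictly above zero.
    return (at_least_as_extreme + 1) / (trials + 1)
```

A small returned value suggests that the observed difference between the two systems would rarely arise from shuffling alone; a value near 1 suggests the scores do not distinguish the systems.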


Similar resources

The statistical significance of the MUC-5 results

The statistical significance of the results of the MUC-5 evaluation is determined using a computer-intensive method of hypothesis testing known as approximate randomization. The exact method is described in detail in [1] and [2] and has been used as the accepted statistical test for the MUC results since MUC-3. The purpose of the statistical testing is to determine whether the scores of th...


Evaluating Message Understanding Systems: An Analysis of the Third Message Understanding Conference (MUC-3)

This paper describes and analyzes the results of the Third Message Understanding Conference (MUC-3). It reviews the purpose, history, and methodology of the conference, summarizes the participating systems, discusses issues of measuring system effectiveness, describes the linguistic phenomena tests, and provides a critical look at the evaluation in terms of the lessons learned. One of the commo...


The statistical significance of the MUC-4 results

The MUC-4 scores of recall, precision, and the F-measures are used to measure the performance of the participating systems. The differences in the scores between any two systems may be due to chance or may be due to a significant difference between the two systems. To rule out the possibility that the difference is due to chance, statistical hypothesis testing is used. The method of hypothesis ...


Reactive ring-opened aldehyde metabolites in benzene hematotoxicity.

The hematotoxicity of benzene is mediated by reactive benzene metabolites and possibly by other intermediates including reactive oxygen species. We previously hypothesized that ring-opened metabolites may significantly contribute to benzene hematotoxicity. Consistent with this hypothesis, our studies initially demonstrated that benzene is metabolized in vitro to trans-trans-muconaldehyde (MUC),...


BBN: description of the PLUM system as used for MUC-6

ABSTRACT This paper provides a quick summary of our technical approach, which has been developing since 1991 and was first fielded in MUC-3. First a quick review of what is new is provided, then a walkthrough of system components. Perhaps most interesting is our analysis, following the walkthrough, of what we learned through MUC-6 and of what directions we would take now to break the perform...



Publication date: 1995